OBJECTIVE METHODS FOR DEVELOPING INDICES OF PILOT WORKLOAD


I.  Introduction. 


The performance of the pilot as an aviation subsystem is conditioned by
a large number of different factors, such as training, selection, and
physical condition (on the personnel side) and the extent to which the
principles of human engineering are applied to the design of the system and
the mission profiles (on the hardware and operational side). For our
purposes, we will assume that appropriate attention has been allocated to
personnel factors. Thus, our primary focus will be on those aspects of
human engineering (human factors, ergonomics) that relate to pilot workload
as it may be affected by the overall design of the system.

The concept of workload is of special interest in that there is abundant
evidence (at least of an anecdotal nature) that workload can be a "go/no go"
modifier of the performance of the pilot as a functional subsystem,
especially under emergency conditions. Therefore, finding or developing an
appropriate methodology that yields reliable and valid measures of pilot
workload is a goal that, if achieved, should lead to important gains in
safety and mission accomplishment through the resultant system design and
procedural modifications.

Our ultimate concern in the measurement of workload must be the
determination of the manner and extent that workload affects the probability
of mission success. Thus, in this context, it is appropriate to raise the
traditional engineering questions related to the probability of "failure" of
the pilot as a functional subsystem. From the point of view of reliability
engineering, we might say as a first approximation that an acceptable level
of workload for a given phase of a mission would be characterized by a set of
system-induced (system in its broadest sense) task demands such that the
probability is equal to or greater than some specified value that the pilot
will be able to satisfy those demands and successfully complete that mission
phase without compromising subsequent mission phases. (Clearly, the
probability value selected for one-time, high-priority missions, for multiple
missions, and for routine operations would likely be different.)

The literature in this area is quite clear on one point. There is no
generally accepted definition of the term "workload." Some authors would
use the term primarily to refer to input loading; e.g., the number and nature
of the displays (and controls) that must be used by the pilot in performing
his job. Others would use the term to refer to how hard the pilot has to
work; these authors tend to prefer biomedical and/or subjective indices of
workload. Still other authors emphasize those aspects of workload that
relate to performance; e.g., speed and accuracy of response.


A. A Working Definition of Workload. No attempt will be made to arrive
at a formal, comprehensive definition of workload; the problems in developing
such a definition are numerous and formidable. However, it seems necessary
to offer some sort of working definition--even though it be rather nonspecific
and largely descriptive--of the way the term will be used here before
meaningful discussion of measurement methodology in the area can be undertaken.
Therefore, for the purposes of this paper, level of pilot workload will be
assumed to be an hypothetical concept that is determined by or (if you
prefer) related to the aggregate of the task demands placed on the pilot by
the system during some relatively short-duration mission or phase of a
mission coupled with the actions required of the pilot to satisfy those task
demands. The actions required may be overt or they may be covert. They may
be physical, they may be mental, they may be perceptual, they may be oral, or
they may be some combination of any or all of these. There may be purposes
for which it is appropriate to talk about system demands independent of pilot
actions in considering workload. However, in the present discourse it will
be assumed that, to the extent a system demand is not followed by suitable
and timely action on the part of the pilot, the mission phase will have been
completed in less than an acceptable manner (if it is completed at all). In
other words, demands that do not require action (either overt or covert) are
not really demands; and actions that are initiated for reasons other than to
satisfy a system demand (and are potentially disruptive of mission
accomplishment) should be eliminated by training and operating procedures.
Thus, "stimulus" and "response" will not be treated separately.

Although for purposes of exposition a general definition of workload is
adopted, it should be clearly understood that the goals and intents of a
given measurement effort are the important determiners of how workload
should be defined and what methodology should be adopted for a specific
application. For example, one designer/researcher may need to know simply
which of two alternative--but otherwise satisfactory--single-purpose displays
makes a smaller contribution to the pilot's workload. Another
designer/researcher may need to know how quickly, if at all, the pilot can
manually operate a device that is normally hydraulically or electrically
powered. Numerous other differences in purposes and, hence--by
implication--methodologies can be readily imagined. More will be said on this
topic later, but it is not our intent to be dogmatic--especially about
unsettled issues.

B. Outline. The remainder of this text will consist of six sections:
Some Rudiments of Measurement Theory; Laboratory Methods; Analytic and
Synthetic Methods; Simulation Methods; In-Flight Methods; and Discussion,
Recommendations, Cautions, and Conclusions. The approach that will be used
in the research-oriented sections will be to describe selected programs in
which particular methodologies have been applied, and, where appropriate,
data will be presented to give an indication of the kinds of results
achieved. No attempt at a comprehensive review will be made; for reviews of
relevant areas, the reader is directed to the following reports:

Gerathewohl (25) has written a concise evaluation of the literature on
workload definition and measurement; Gartner and Murphy (23) survey and
critique the literature on pilot workload and fatigue; White (55) reviews
task analysis methods in relation to workload specification; Jahns (29)
provides a general review and evaluation of the literature on operator
workload; and Spyker, Stackhouse, Khalafalla, and McLane (50) review the
workload literature in relation to quantitative, subjective, and
physiological methods.

II. Some Rudiments of Measurement Theory.

This section is not in any way intended to be a definitive exposition on
measurement theory. However, certain basic concepts of measurement theory
will come up in later sections and it seems expedient to mention and briefly
explain them before proceeding. (Some readers may wish to skip this section.)

A. Validity. The first and perhaps most important notion to be dealt
with is validity. Ultimately, this simply means, "Are we really measuring
what we intend to be measuring?" The answer to this question, in the most
precise use of the term, assumes the existence of a criterion. For example,
in the field of selection, we might want to select only those aviation
candidates who have a high probability of completing flight training; our
criterion, then, would be successful completion of training (and perhaps
final average grade). The validity of the selection measure would thus be
determined by the accuracy with which it predicts which trainees will
graduate. Unfortunately, in the workload area we have no such criteria, and,
therefore, we must rely primarily on what is called "content validity"--which
really amounts to expert, professional opinion. Still another kind of
validity, "face validity," can be important in motivating test subjects; in
this sense, (face) validity means the test situation appears to be like the
job of the pilot. (No small part of the expense of building simulators is
devoted to trying to achieve face validity.)

B. Reliability. Reliability has several meanings that are applicable
in varying degrees to the problem of workload measurement. In one use, it
refers to the engineering characteristics of the measurement system and
relates to the repeatability of a measure or phenomenon; with a constant
known input, what is the variability of the output? That is, how accurately
can the output be predicted from the input? Reliability in this sense
involves internal characteristics of the test device, and the term is used
to reflect the sensitivity of a measurement procedure to temperature changes,
drift characteristics of components, etc. A second, closely related use of
the term "reliability" depends not only on the above characteristics of the
test equipment but also on the human behavior being measured. For example,
in even the most carefully controlled experimental situation, the response
latency of the human subject to the onset of a light will show variation
across trials and across individuals; the amount of such variation will
depend on the behavior being measured. In this use of the term, an
approximation of the reliability estimate can be obtained by observing the
extent to which a group of individuals shows the same rank ordering on each of
two measurements of the phenomenon per individual. This is generally
referred to as test-retest reliability. It should be noted that the apparent
reliability (i.e., the size of the reliability coefficient) is dependent on
both the true reliability of the test or equipment used and the existence of
stable individual differences in the behavior being measured. Thus, with
highly trained, highly selected, skilled operators, the variability for a
given individual from trial to trial may be as great as the variability
across individuals on a given trial. Under such conditions, the measured
reliability could appear to be rather low even though the basic measures are
quite stable. In any case, if meaningful comparisons are to be made
concerning workload variations, some estimate of the stability and precision
of the measures must be secured. Otherwise, there is no way to determine
whether an obtained difference in a measure is properly interpreted as being
real or as being a result of chance factors.
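
The rank-ordering comparison described above can be sketched as a rank
correlation between two sessions. The following is only an illustration under
stated assumptions: the latencies are invented for the example, the function
names are hypothetical, and Spearman's rank correlation is used as one common
way of expressing agreement in rank order; the text does not prescribe a
particular coefficient.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def retest_reliability(first, second):
    """Spearman rank correlation between two measurements per individual."""
    return pearson(ranks(first), ranks(second))


# Hypothetical response latencies (ms) for five subjects, measured twice.
session_1 = [210, 250, 190, 300, 275]
session_2 = [220, 240, 200, 290, 310]

rho = retest_reliability(session_1, session_2)
print(f"test-retest rank correlation: {rho:.2f}")
```

If the subjects were so homogeneous that their rank order changed freely from
one session to the next, the same calculation would return a low coefficient
even with stable equipment, which is the caution raised in the text.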

C. Sensitivity. In any evaluation of alternative system designs or
system operating procedures, it is necessary to have some index of the
sensitivity of the measures to the variables being manipulated. For example,
simple reaction time to an attention-getting signal calling for a single
response is quite stable even when there are large changes in presumably
important variables. The same is true of many simple tracking tasks. Perhaps
the main reason for this stability is the extreme adaptability of the human
operator. If the operator is confronted with a task situation in which he
can concentrate all of his resources on the performance of the task, then, at
least for relatively short intervals, he can maintain his performance of
single tasks amazingly well. Thus, for example, if altitude were a variable
of interest and simple reaction time were the measure used, we would conclude
that performance is not impaired until the pressure altitude is somewhat in
excess of 5,000 meters. Thus, such simplistic approaches could lead to
questionable conclusions. What all of this means is that it is sometimes
necessary either to do preliminary research or to add variables to the main
research simply to get an index of the sensitivity of the measurement
procedure to relevant variables.

D. Magnitude of Effect. If two alternatives (displays, for example) are
exactly equivalent in terms of cost, weight, size, etc., then any reliable
(statistically significant) superiority of one alternative over the other is
sufficient basis for choosing the better alternative. However, if there are
important differences between the two in terms of cost, weight, etc., then it
is necessary to establish not just the statistical significance of a
difference (if there is one) but, especially if the more expensive one is the
better, how much better it must be to make in fact a practical difference.
Expert, professional judgment plays a major role here.

III. Laboratory Methods.

From the point of view of methodology, there are five characteristics of
"laboratory" methods that make them highly desirable. First, for most
laboratory tasks, it is possible to exercise very precise control over the
performance demands imposed on the operator. One can with relative ease
control the number of tasks that are active, the rates at which signals are
presented, and the timing of the signals on individual signal sources as well
as across sources. Second, "exact" duplication of test procedures is readily
achieved. Third, laboratory methods in general can provide the highest
precision of measurement that one is likely to achieve in the realm of
operator behavior. Fourth, depending on the level of complexity of the
experimental task structure, high equipment reliability is possible at
relatively modest costs, and, because physical safety is not involved, any
lack of mechanical or electrical reliability is primarily just a source of
inconvenience. In addition, tasks can be selected and structured so that
good test-retest reliabilities are common. And fifth, it is generally not
terribly difficult to establish the sensitivity of the task measures to
variables of known operational importance and behavioral potency.

A. Background Research. Early in the history of the behavioral
sciences, there was considerable interest in the area of mental load in what
would now be called an information processing context. These early efforts
were directed at an attempt to break down complex reaction time into its
constituent components. To illustrate how this breakdown was approached,
assume that the operator is confronted with a red light on the right of a
display and a green light a few centimeters to its left. Assume further
that two response buttons are conveniently located for the use of the right
hand. The subject is instructed to depress the rightmost button if the red
light comes on and the left button if the green light comes on. Thus, the
subject must decide which light came on and which button is correct. Assume
now a different procedure: a number of responses are recorded in which only
the red light and the rightmost button are present and other responses when
only the green light and the leftmost button are present. With this
procedure, the subject only has to become aware that a light is on and
respond. The notion is that the difference between the average response time
to the single-light/single-button conditions and the two-light/two-button
condition provides an estimate of the "mental" processing time in recognizing
whether the red or the green light has been illuminated in the latter
condition. This general procedure has been expanded and permutated in a
variety of ways. The well-established result is that if N signals are
uniquely coordinated to N possible responses, then:

Reaction Time = a + b log_2 N

where a is the y-axis intercept, b is the slope constant, and log_2 N is the
measure of information "H." Thus, it is seen in this very elementary case
that performance is a function of task demand or workload.
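
As a brief numerical illustration of the relation above, the sketch below
evaluates reaction time for several numbers of equally likely alternatives.
The intercept and slope values are assumptions chosen only for the example;
they are not taken from any of the studies cited, and the function name is
hypothetical.

```python
import math


def choice_reaction_time(n_alternatives, a, b):
    """Reaction time from RT = a + b * log2(N), for N >= 1 alternatives."""
    if n_alternatives < 1:
        raise ValueError("n_alternatives must be at least 1")
    return a + b * math.log2(n_alternatives)


# Assumed illustrative constants: intercept in seconds, slope in seconds per bit.
A_INTERCEPT = 0.20
B_SLOPE = 0.15

for n in (1, 2, 4, 8):
    rt = choice_reaction_time(n, A_INTERCEPT, B_SLOPE)
    print(f"N = {n}: H = {math.log2(n):.0f} bits, predicted RT = {rt:.2f} s")
```

With one alternative the information term is zero and the prediction reduces
to the intercept a; each doubling of N adds one bit and therefore one
increment of b to the predicted time.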

B. Timing--Speed and Load Stress. Another line of laboratory research
has been concerned with the timing of response in a monitoring situation.
The notion of timing in skilled performance was first introduced by
Sir Frederic Bartlett (4). The concept was further refined by Conrad (18),
who proposed to define timing (of responses) as "creating the most favorable
temporal conditions for response." Conrad treated load in his studies as
being a function of the number of signal sources and considered load stress
to be produced by increasing that number beyond some value. He used the
term speed stress to refer to excessive rates of presentation of signals from
a given source (or number of sources). Conrad found that subjects tended to
alter the point of response initiation in a manner apparently designed to
even out, temporally, the sequence in which they were required to take
action. In a later study, Conrad (19) gave subjects limited control over the
average rate at which signals would appear; this control gave subjects the
opportunity to slow down the signal rate so they could successfully respond
to nearly concurrent signals on separate displays; on the average,
subjects did better under this condition. These results are suggestive of
the desirability of, wherever possible, adopting designs and operating
procedures that permit latitude in the exact point at which events must be
initiated by aircrew personnel.

Knowles, Garvey, and Newlin (34) investigated speed and load effects in
a different context; they were interested in display-control compatibility
relationships. The part of their experiment that is of particular interest
here is the comparison of a 10x10 matrix of lights (associated with a 10x10
matrix of response buttons) and a 5x5 matrix of lights (associated with a
5x5 matrix of buttons). The rate of presentation of information (not
signals) was equalized across the two conditions; the rates used were 1.75,
2.25, 2.75, and 3.0 bits/second. They found that load (display size) had a
greater effect on error rate than did rate of presentation of signals. (See
Table 1.) They also found, incidentally, that subjects could respond at an
average rate of 0.45 signal per second without errors in a self-paced mode
whereas when the task was forced-paced at that same rate, subjects made
36 percent errors.

Table 1. Mean Errors Per 100 Stimuli*

                         Speed (bits/s)
Matrix            1.75    2.25    2.75    3.0

Small (5x5)        2.5     3.6     4.1     7.1

Large (10x10)      3.6    10.8    13.1    15.8

*Adapted from Knowles, Garvey, and Newlin (34).
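
Because the two display sizes were matched on rate of information rather
than rate of signals, the same value in bits per second corresponds to
different signal rates for the two matrices. The sketch below works this
out under the assumption that every light in a matrix is equally likely, so
that each signal carries log_2 of the number of lights; that assumption and
the function names are introduced here for illustration and are not stated in
the study as summarized above.

```python
import math


def bits_per_signal(n_lights):
    """Information per signal, assuming all lights are equally likely."""
    return math.log2(n_lights)


def signals_per_second(bits_per_second, n_lights):
    """Signal rate that delivers the given information rate."""
    return bits_per_second / bits_per_signal(n_lights)


INFORMATION_RATES = (1.75, 2.25, 2.75, 3.0)  # bits/s, the rates reported above
MATRICES = {"Small (5x5)": 25, "Large (10x10)": 100}  # number of lights

signal_rates = {
    name: [signals_per_second(rate, n) for rate in INFORMATION_RATES]
    for name, n in MATRICES.items()
}

for name, rates in signal_rates.items():
    formatted = ", ".join(f"{r:.3f}" for r in rates)
    print(f"{name}: {formatted} signals/s")
```

Under this assumption the large matrix receives fewer signals per second than
the small matrix at every information rate, which is why a higher error rate
with the large matrix points to load rather than to signal rate.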
C. Secondary Loading Tasks. One general, more direct approach to the
study of workload in the laboratory has been through the use of secondary or
loading tasks. Knowles (36) summarizes early work of this sort and provides
the general rationale for the application of the technique to workload
measurement in a part-task simulation context. Knowles (page 156) states
that auxiliary tasks are used ". . . with the intention of finding out how
much additional work the operator can undertake while still performing the
primary task to meet system criteria.

"Secondary tasks are used because primary part-task performance measures,
in and of themselves, seldom reflect operator-load. . . . they seldom tell
the price paid in operator-effort in meeting (the system) criterion."

Knowles goes on to describe an earlier study, Knowles and Rose (35), in which
a simulated lunar landing task was being investigated. He says that in that
study: "The loading scores were sensitive to differences in problem
difficulty; they reflected increased ease in handling the control task as a
function of practice; they revealed differences in workload between members
of a two-man crew; and they showed that the particular control law under
consideration was unsatisfactory because of the extreme buildup of operator
load during the last few seconds of the landing. None of these results was
available from system performance criteria; i.e., time, fuel, miss-distances."
(Emphasis added.) The basic approach in this method is to compare the levels
of performance achieved on the "loading" task when performed alone with the
levels achieved when it is performed in combination with the primary task;
this difference is said to provide an index of the workload imposed by the
primary task.
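
The comparison just described can be sketched as a simple difference score.
The example below is a minimal illustration only: the scores, the display
labels, and the function names are hypothetical, higher scores are assumed to
mean better loading-task performance, and expressing the drop as a proportion
of the task-alone score is an added convenience rather than something the
studies cited are said to have done.

```python
def loading_task_decrement(alone_score, dual_score):
    """Raw drop in loading-task performance when the primary task is added."""
    return alone_score - dual_score


def proportional_decrement(alone_score, dual_score):
    """Drop expressed as a fraction of the loading-task-alone score."""
    if alone_score <= 0:
        raise ValueError("alone_score must be positive")
    return loading_task_decrement(alone_score, dual_score) / alone_score


# Hypothetical percent-correct scores on the loading task.
alone = 95.0
with_primary = {"display A": 80.0, "display B": 62.0}

workload_index = {
    name: proportional_decrement(alone, score)
    for name, score in with_primary.items()
}

for name, index in workload_index.items():
    print(f"{name}: loading-task decrement = {index:.2f}")
```

On this reading, the primary-task configuration that leaves the larger
decrement on the loading task would be taken as imposing the greater
workload, subject to the cautions about primary-task consistency discussed
in the remainder of this section.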

Benson, Huddleston, and Rolfe (5) reported a study in which, among other
things, they evaluated a one-dimensional tracking task by using two altitude
displays; performance was measured with each display with and without a
secondary light-acknowledgment task. They found a small, consistent
superiority of a counter-pointer display over a counter-only display with the
tracking-only condition. When the secondary task was added, they found
significant decrements in tracking with both displays with a significant
superiority of the counter-pointer over the counter-only display. The
secondary task showed significant decrements when added to either tracking
task; the differences between display conditions were fully compatible with
the findings for the tracking task--namely, the display that showed the
better performance in tracking showed the lesser effect on the performance of
the secondary task. They interpret the decrements in the primary tracking
task to pose serious questions as to "the essential feature of the
subsidiary task situation; namely, that consistent primary task performance
is possible in two task conditions." Benson et al. (5) instructed their
subjects that they were to attend to the secondary task only when they could
properly do both jobs together. They interpret their results to suggest that
subjects may not be able to comply with such instructions and discuss at some
length whether and how subjects might be able to perceive that their
performance is being maintained on the primary task. They also suggest the
possibility that a continuous primary task may be more likely to suffer
decrements than a discrete primary task. Depending on the frequency
characteristics of the display disturbances and the time it takes the subject
to perceive which light has been illuminated, it is quite reasonable to
expect that, on a probabilistic basis, looking at and responding to their
secondary task would encourage error accumulation on their primary task.

It should be noted that Benson et al. (5) concluded that "there is no
doubt that the presence of a second task added to the value of the
experiment. . . ." Thus, their discussion of the changes in the primary task
is related primarily to "theoretical" expectations as to how the secondary
task technique should operate in practice. It could be argued that their
experiment actually demonstrated two important findings: (1) the
counter-pointer display is better in that it resulted in better performance
(numerically in the case of tracking only and statistically in the case of
the two-task situation); and (2) the counter-only display is more sensitive to
possible distraction or interference from other tasks.

The question can also be raised as to whether the subsidiary task
technique necessarily relies on the subject's achieving parity of performance
on the primary task between the one- and two-task conditions. Clearly,
Benson et al. (5) demonstrated in their experiment that useful information
can be obtained from the technique when this assumed state of affairs does
not obtain. If we consider one of the empirically based reasons that Knowles
pointed at in using the technique, it is frequently the apparent absence of an
effect on single tasks of possibly important variables that suggests the
possible value of using secondary operator loading tasks. Thus, it could be
argued that so long as changes in the primary task and the secondary task are
compatible (i.e., lead to the same conclusions), we should not be overly
concerned about changes in the primary task--changes that may be valuable
data in and of themselves.

Senders (49) says there are four assumptions that underlie the secondary
loading task methodology: (1) The operator is a single-channel system.
(2) The channel has a fixed capacity. (3) The capacity has a single metric
by which any task can be measured. And (4) the constituents of workload are
additive linearly, regardless of the sources of the load. These assumptions
are required if channel capacity is to be given formal status as that term is
used in information theory. However, in the practical application of the
secondary loading task methodology, I believe the first and second
assumptions stated by Senders are of major significance only under certain
conditions--for example, when neither the primary task performance nor the
loading task performance changes when the two are performed simultaneously.
In that event, although we would have learned something interesting about the
two tasks, we could not be sure whether the primary task represents a
"no load" condition, the operator has employed a previously "unused" channel,
the operator has simply "expanded" his (single) channel capacity, or, what is
most likely, the time requirements of the two tasks are such that the
performance of neither interferes with that of the other. The possible
absence of linear additivity places a heavy burden of responsibility on the
choice of the loading task; clearly, the loading task must have properties in
the "additivity domain" that warrant generalization to the kinds of system
tasks that might be coupled with the primary task being investigated. By
the same token, the metric implied by the secondary task must also be
applicable to possible system task requirements.

Perhaps the safest interpretation of the changes in the secondary task
would be that they serve as an index of the spare time that the operator has
while performing the primary task at criterion levels. But even in this
interpretation it is necessary to make some kind of assumption regarding the
ease of back-and-forth transition (primarily in terms of time) between the
primary task and the particular secondary task being used. Rolfe (42), who
provides an excellent review and discussion of the secondary task method of
measuring workload, closes with the following caution: "The final word,
however, must be that the secondary task is no substitute for competent and
comprehensive measurement of primary task performance. The technique
should always be looked upon as a means of gathering additional information
rather than an easy way of gathering primary information." This caution
should not be taken lightly, even though the study of Knowles and Rose (35)
showed secondary task measures to be sensitive to important factors not
revealed by the primary task measures.

D. Cross-Adaptive Loading Tasks. Kelley and Wargo (31) take the
position that consistent performance on the primary task is a must. They
offer data from a demonstration experiment using two subjects in which
decrements on primary and secondary tasks are apparently not compatible;
conditions that were ranked, in order of merit, A, B, C on the primary task
were ranked B, A, C by measures from a secondary task. Their primary task
was a two-dimensional, two-display compensatory acceleration tracking task;
the secondary task consisted of two identical "warning" lights, one above the
other, located where subjects could see them by peripheral vision but had to
look at them directly to determine which light had been illuminated; response
to the lights was made with a thumb switch located on the tracking control
stick. When the lights task was active, one of the lights, selected at
random, would turn on 0.44 second after the subject extinguished the previous
light. The primary task variable of interest was display gain, of which
there were three levels. Three test conditions were used: primary task only,
primary task plus the loading task with independent programing (straight
subject pacing), and primary task plus "cross adaptive" programing of the
loading task. In this latter case, as long as tracking error (vector
root-mean-square (RMS)) remained below the criterion level, one of the lights
would be turned on as noted above. If error exceeded the criterion level,
the lights task would be deactivated until tracking error again was below
criterion. It is important to note that Kelley and Wargo (31) instructed
their subjects to perform both tasks "... as well as they could and not to
neglect one for the other." Thus, the concepts of primary and secondary are
somewhat blurred; the experimenter, without informing the subjects, had
arbitrarily decided which was which. The previously mentioned findings from
Kelley and Wargo, in which the inferences from the primary and secondary task
performances were not compatible, were taken from the condition involving
tracking plus the subject-paced loading task. Their results are less than
compelling for several reasons. First, only two subjects were used.
Second, the display gain variable was significant for the tracking-only
condition. Third, the display gain variable was significant for the
subject-paced loading-task condition for one subject though not for the
other. And, fourth, a cleaner evaluation of the cross-adaptive approach to
using loading tasks would have resulted if task priorities had been
manipulated, for example through instructions. However, the approach,
overall, looks interesting, and further evaluation of its characteristics
vis-a-vis traditional loading-task procedures would appear to be warranted.
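
The cross-adaptive rule described above (present the loading task only while
tracking error is below criterion) can be sketched in a few lines of Python.
This is a minimal illustration, not Kelley and Wargo's implementation: the
function names, the sample error values, and the criterion of 1.0 are all
hypothetical, and treating an error exactly equal to the criterion as "not
below" is an assumption.

```python
def cross_adaptive_schedule(rms_errors, criterion):
    """Return, per sample, whether the loading (lights) task is active.

    The loading task is active only while RMS tracking error is strictly
    below the criterion (the handling of equality is an assumption).
    """
    return [err < criterion for err in rms_errors]


def loading_fraction(rms_errors, criterion):
    """Fraction of samples during which the loading task could be presented."""
    active = cross_adaptive_schedule(rms_errors, criterion)
    return sum(active) / len(active) if active else 0.0


# Hypothetical RMS tracking-error samples (arbitrary units).
errors = [0.8, 0.9, 1.3, 1.1, 0.7, 0.6]
print(cross_adaptive_schedule(errors, criterion=1.0))
print(loading_fraction(errors, criterion=1.0))
```

Under this scheme the fraction of time the loading task is active falls as the
primary task becomes harder, so that fraction could itself be read as a
candidate index of primary-task loading.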

E. Memory Scanning Tasks. Another variation on the secondary task
technique has been described by O'Donnell (41). This procedure is "an
adaptation of an item recognition technique first described by Sternberg"
(51,52). The basic approach is that the operator is required to learn a set
of positive stimuli (so-called because their appearance calls for a positive
response). Members of the positive set, frequently letters of the alphabet,
are presented one at a time; generally, on half of the trials the stimulus is
a member of a negative set. On the appearance of a letter, the operator is
instructed to respond as quickly as possible by depressing a "yes" key if the
letter is a member of the positive set and a "no" key if it is a member of
the negative set. Under appropriate conditions, a linear relation exists
between the size of the positive set (typically 1 to 8) and reaction time.

The psychological theory behind the use of this task is that average reaction
time with a given number of stimuli in the positive set can be broken down
into three parts: (1) stimulus encoding, (2) memory scan, and (3) response
selection and execution. For a given set of conditions, the first and third
parts are assumed to be constant, whereas the second part is interpreted to
be a direct reflection of memory scan speed and/or memory load. Thus,
changes in the y-intercept value are assumed to reflect changes in the
perceptual and/or response aspects of the task. Changes in the slope of the
curve are assumed to reflect changes in the rate at which memory is scanned
and/or the amount of memory load involved. In other words, the y-intercept
value serves the same function as a measure from a secondary loading task as
described previously; the higher the intercept (i.e., the longer the average
response time), the greater the assumed loading produced by the primary task.
In addition, a change in the slope of the response-time curve might be
interpretable as a reflection of the amount of memory load added by the
primary task. The value of this task as a loading task in the usual sense
has been borne out by the results of preliminary studies conducted thus far.
However, the possibilities with respect to its providing a measure of memory
load are still to be demonstrated. It should be noted that earlier results
reported by Darley, Klatzky, and Atkinson (?P) suggest that the addition of
memory load not directly related to the item recognition task does not affect
the slope of the reaction time curve.
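
The intercept-and-slope interpretation above can be made concrete with a
small Python sketch that fits a straight line to reaction time as a function
of positive-set size. The reaction times below are invented for illustration
only (they are not data from O'Donnell or Sternberg), and the function name
fit_line is hypothetical.

```python
def fit_line(set_sizes, reaction_times):
    """Ordinary least-squares fit of RT = intercept + slope * set_size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(reaction_times) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, reaction_times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical mean reaction times in milliseconds for four set sizes.
sizes = [1, 2, 4, 6]
scan_task_alone = [440, 480, 560, 640]
with_primary_task = [490, 530, 610, 690]

base_intercept, base_slope = fit_line(sizes, scan_task_alone)
load_intercept, load_slope = fit_line(sizes, with_primary_task)

# A higher intercept with an unchanged slope would be read, under the theory
# described in the text, as added perceptual/response loading without a
# change in memory scanning rate.
print(load_intercept - base_intercept, load_slope - base_slope)
```

In this made-up case the primary task shifts the intercept upward while
leaving the slope unchanged, which is the pattern the text associates with
loading in "the usual sense."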

F. Synthetic Work Tasks. Operator workload has also received
attention in an area of laboratory research that is concerned with
"synthetic work." The rationale for the development of synthetic work tasks
has been described in detail elsewhere (13,14); however, for those readers to
whom the notion is new, a brief description of the techniques and philosophy
will be given here.

The point of departure of the synthetic work approach is a behavioral
analysis of the performance requirements placed on the operator by some
particular aviation system or by a class of such systems in general. Tasks
are then selected against a criterion of content validity (i.e., tasks are
selected because they measure functions judged by experts in the field to be
important to aircrew operations) as well as a general criterion of face
validity (i.e., the tasks are configured to be acceptable to target
populations, such as pilots). Consumer acceptance of the tasks has always
been good (13). The resultant hardware is designed so that the selected
tasks can be presented in any combination desired and individual tasks can be
varied along both time constraint and task difficulty parameters. The
original goals of the program in which the particular system to be described
was conceived were the evaluation of procedural (e.g., work schedules),
environmental (e.g., altitude), and pharmacological (e.g., alcohol) variables
as these factors might affect complex performance.

Within the context of the way these tasks were developed and have been
used, the notion of workload is a relative concept. However, from the
beginning it was assumed that it would be desirable, if not necessary, to
vary the apparent workload imposed on the operator from very light to near
overload; overload is defined, for this purpose, as decrements on all or most
of the concurrently performed tasks, even in the absence of any external
stressor. Thus, extensive data have been collected on a variety of task
combinations that, on a rationally defensible basis, would be expected to
correspond to different workloads.

The specific tasks used involve monitoring of lights and meters
(providing measures of reaction time), mental arithmetic, pattern
discrimination, elementary problem solving, and two-dimensional compensatory
tracking. The task combinations used in a study by Hall, Passey, and Meighan
(26), involving an earlier version of what is called the Multiple Task
Performance Battery (Chiles et al., 13), are shown in Table 2. Note that
two basic conditions were examined--monitoring tasks only and "full battery"
as specified in Table 2. If it is assumed that the subjects tended to treat
the monitoring tasks as secondary (loading) tasks, then the performance
levels on those tasks can be considered to be an index of the workload
imposed on the operator by the different combinations of the other tasks.
Figure 1 shows the response latencies on a normalized scale for the responses
to the offset of any one of five green lights located one at each corner and
one in the middle of the test panel. Figure 2 shows response times in
seconds for the detection of a shift in the average value of the "randomly"
wandering pointer of any one of four meters located across the top of the
test panel. Each of these figures contains two curves--one for the given
monitoring task performed with only the monitoring tasks active and one for
monitoring performance as a function of the different "active task"
combinations. Note that the first and the last points of the curves labeled
"full battery" consist of only the monitoring tasks, thus providing "anchor
points" for the curves. The normalizing scale applied to the data for the
green-lights monitoring tends to suppress the apparent amplitude of the shift
in response times, but the changes across task combinations are statistically
significant. The changes in the meter-monitoring task are much larger and,
of course, are also statistically significant.

Table 2. Performance Schedule*

                          Monitoring Only      Complex
15-Minute Interval        1 2 3 4 5 6 7 8      1 2 3 4 5 6 7 8

Auditory Vigilance        X X X X X X X X      X X X X X X X X
Warning Lights            X X X X X X X X      X X X X X X X X
Meter Monitoring          X X X X X X X X      X X X X X X X X
Mental Arithmetic                              X in 2 of 8 intervals
Problem Solving (Group)                        X in 4 of 8 intervals
Pattern Discrim.                               X in 2 of 8 intervals

*Adapted from Hall, Passey, and Meighan (26).

Figure 1. Mean response latency in detecting green warning-light signals
during each 15-minute period of the basic 2-hour task program.
(Adapted from Hall, Passey, and Meighan, 26). Vertical axis: response
latency (normalized scale).

Figure 2. Mean detection time for correct detections of probability
monitoring signals during each 15-minute period of the basic 2-hour task
program. (See Table 2.) Horizontal axis: task program period.

The data shown in Figure 3 are from a later unpublished study using the
task schedule shown in Table 3 and using pilots as the subjects. Figure 3
shows response times in seconds to the onset of red lights (physically
paired with the green lights) and the offset of green lights. Figure 3 also
shows the detection times in seconds for the meter-monitoring task. (Although
the tasks are functionally the same as those used by Hall et al. (26), the
data of these two figures were collected by using a new, computerized version
of the Multiple Task Performance Battery.) For all three task measures, the
differences across task combinations are significant. (It may or may not be
important that the longest response times for the light-monitoring tasks
were associated with a different task combination than were the longest
response times for the meter-monitoring task.) Significant differences were
also found between task combinations for the tracking task (vector RMS error)
and for the problem solving task (redundant responses). Neither the mental
arithmetic task nor the pattern discrimination task showed significant
differences as a function of task combination. This lack of differences
could mean that these latter tasks are less sensitive to workload
variations, or it could mean that they were given higher priorities by the
subjects. Although a detailed evaluation of exactly how to account for the
differences across tasks is not relevant to our purposes, some general
observations are perhaps in order.

Figure 3. Monitoring performance as a function of task combination as shown
in Table 3.

The data of Figure 3 are based on the mean of two 1-hour sessions; the
subjects had had a total of about 7 hours of practice on the tasks before
the first of these sessions and 10 hours of practice before the second.
Among the literally hundreds of subjects who have learned to perform these
tasks, it has been typical that the subjects initially have difficulty, for
example, completing arithmetic problems in the allotted 20 seconds with any
time to spare. Similarly, they frequently get "hung up" on the
problem-solving task at the expense of the other tasks, even though they are
reminded during training that they are to attend to all tasks. Thus, the
learning procedure typically consists of first, acquiring skill on the
individual tasks and, then, gradually learning to shift rapidly and
efficiently from a given active task on which their attention may be focused
at a given time to concurrent demands (e.g., the onset of a red light or
another active task); or, on satisfaction of the momentary demands of the
active tasks, they may shift to scanning the panel for monitoring signals.
It is also clear that even at high levels of training, there are substantial
individual differences in the smoothness and speed with which attention
appears to be shifted from exercising one kind of behavioral process to
another, different kind of process. For this and other reasons, a study was
undertaken (30) to determine whether an independent (time sharing?) skill in
this domain could be identified by using the techniques of factor analysis.
In this study, the lights (red and green) and the meter-monitoring tasks were
found to load on separate factors when performed as individual tasks. When
performed as part of a complex task, these monitoring tasks all loaded on a
third, independent factor. If these results, which suggest a possible
time-sharing ability, should hold up on replication, important implications
are suggested for the selection of subjects to be used in various kinds of
tests of systems and system components.

The synthetic work methodology has yielded other results of relevance to
the use of secondary loading tasks as measures of workload. In a study of
the effects of blood alcohol levels of approximately 0.1 percent, a device
that was different from the Multiple Task Performance Battery described above
was used, but the requirements for time sharing were similar; performance of
different combinations of mental arithmetic, monitoring, and two-dimensional
tracking tasks was required (16). The results showed that the monitoring
tasks were affected at each of the two levels of workload used, but the
tracking task was affected only at the higher of the two workloads (tracking,
monitoring, and arithmetic). The arithmetic task was not significantly
affected under either workload condition. In this study, the subjects
apparently regarded the arithmetic task as being a "primary" task and gave it
priority over the other tasks; it could perhaps be argued that the subjects
"protected" their arithmetic performance at the expense of the other tasks.
When just the tracking and monitoring tasks were presented, it could
similarly be argued that they placed priority on the tracking task and
"protected" that performance. Whether or not these proposed interpretations
are accepted as reasonable, it seems clear (and very commonsensical) that the
priority an operator assigns to a task will be an important factor in
determining the level of performance maintained on that task as other duties
are added.

IV. Analytic and Synthetic Methods.

The methods to be discussed in this section have been somewhat
arbitrarily categorized as analytic or synthetic. (Both types of methods
have some elements of each general approach, but, in my opinion, the first to
be discussed leans a little more in the analytic direction and the second, a
little more in the synthetic direction.)

A. Analytic Method. Senders has been a major proponent of the analytic
method of workload analysis (44,45,46,47,48). This basic approach rests on
the following assumptions (49):

1. Visual distribution of attention is the major indicator of operator
workload.

2. The various signals that must be monitored demand attention
commensurate with the characteristics of the signal and the required precision
of readout of the signal by the human operator.

3. The human operator is effectively a single-channel device capable of
attending to only one signal at any time.

4. The probability of human failure at any time is equal to the
probability that two or more signals will demand simultaneous attention.

Senders states that these are simplistic assumptions in the sense that
other signal sources (e.g., auditory) are not considered; attention to the
visual part of continuous manual control tasks is not considered; and
peripheral vision is not taken into account. Thus, the major analyses have to
do primarily with instrument layout and deal only with requirements for
instrument reading as a source of workload.
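
Assumption 4 can be illustrated with a short Python sketch. It computes the
probability that two or more signals demand attention at the same instant,
given a per-signal probability of demanding attention. Two things here are
assumptions not stated in the source: that the signals are statistically
independent, and that each signal's demand can be summarized by a single
probability. The function name and the example values are hypothetical.

```python
def prob_simultaneous_demand(p_demand):
    """P(two or more signals demand attention at the same instant).

    Assumes independent signals; p_demand[i] is the probability that
    signal i demands attention at a given instant.
    """
    # Probability that no signal demands attention.
    p_none = 1.0
    for p in p_demand:
        p_none *= (1.0 - p)

    # Probability that exactly one signal demands attention.
    p_exactly_one = 0.0
    for i, p in enumerate(p_demand):
        others_quiet = 1.0
        for j, q in enumerate(p_demand):
            if j != i:
                others_quiet *= (1.0 - q)
        p_exactly_one += p * others_quiet

    return 1.0 - p_none - p_exactly_one


# Hypothetical demand probabilities for three instruments.
print(prob_simultaneous_demand([0.10, 0.15, 0.05]))
```

With a single signal the result is zero, as a single-channel operator is
never asked to attend to two things at once; adding instruments or raising
any one instrument's demand increases the computed failure probability.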

An important feature of this approach is that it can be applied in
advance of the existence of specific hardware; it requires only that certain
conditions be specifiable. For a given visual display, if the following
information is available, then workload-related parameters can be calculated:

1. The maximum or cutoff frequency of the display must be specified.
From this figure, the required fixation frequency as a function of time can
be calculated.

2. Signal amplitude and acceptable error of reading must be specified.
From 1 and 2 the information rate for the display can be calculated.
From the information rate, the fixation duration can be calculated (on the
basis of the known relation between information content and response time).

The product of fixation frequency and duration of observation yields the
time required for observing the display expressed as seconds/second. The
times found for each display instrument can be summed to get an index of
monitoring workload as total seconds/second required overall in observing
instruments. If uncorrelated signal sources are assumed, transition
probabilities (e.g., probability of looking at display B after having
observed display A) can be calculated and thus lead to guidelines for
optimum instrument layout.
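
The chain of calculations above can be sketched in Python as follows. The
text does not give the formulas, so every formula here is a labeled
assumption: fixation frequency is taken as twice the cutoff frequency (a
Nyquist-style sampling rule), information per fixation as log2 of signal
amplitude over acceptable error, and fixation duration as a linear function
of that information. The constants seconds_per_bit and base_seconds and the
display values are hypothetical.

```python
import math


def fixation_frequency(cutoff_hz):
    """Required looks per second (assumed Nyquist rule: twice the cutoff)."""
    return 2.0 * cutoff_hz


def bits_per_fixation(amplitude, acceptable_error):
    """Information read per look (assumed: log2 of amplitude/error ratio)."""
    return math.log2(amplitude / acceptable_error)


def fixation_duration(bits, seconds_per_bit=0.1, base_seconds=0.2):
    """Seconds per look (assumed linear in information content)."""
    return base_seconds + seconds_per_bit * bits


def display_workload(cutoff_hz, amplitude, acceptable_error):
    """Seconds/second needed to monitor one display."""
    freq = fixation_frequency(cutoff_hz)
    bits = bits_per_fixation(amplitude, acceptable_error)
    return freq * fixation_duration(bits)


def monitoring_workload(displays):
    """Sum of seconds/second over all displays."""
    return sum(display_workload(*d) for d in displays)


# Hypothetical displays: (cutoff frequency in Hz, amplitude, acceptable error).
panel = [(0.25, 8.0, 1.0), (0.125, 16.0, 1.0)]
print(monitoring_workload(panel))
```

A total at or above 1.0 second/second would mean the operator has no time
left over from instrument reading, which is how such an index would flag
overload under these assumptions.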

Senders (4f>) tested these notions in a laboratory situation by using four
meters that were driven at different frequencies. He then compared predicted
fixation frequencies based on the display characteristics with fixation
frequencies as determined by motion pictures of the eye positions of the
subjects. The agreement between prediction and data was quite good.
Subsequently, Carbonell, Ward, and Senders (9) compared predictions with data
from pilots flying approaches to landing in a simulator. Instrument pickoffs
were used to establish the frequency characteristics of the various instrument
displays and eye-movement measures were used to determine fixation
frequencies. The agreement between the values from the prediction procedures
(Nyquist model) previously used (49) and the data was reasonably good;
however, a queueing theory model gave substantially better agreement.

Clement, Jex, and Graham (17) describe the application of a "manual
control-display theory" to instrument landings of a "large subsonic jet
transport." This theory, detailed by McRuer and Jex (37) and McRuer, Jex,
Clement, and Graham (38), attempts to use hypothesized ratios between
fixation frequencies and display bandwidths that are tailored to the
accuracy-of-control requirements for the particular display. Then, using a
procedure otherwise similar to that described by Senders (49), Clement et al.
(17) computed a fractional scanning workload index for each display function
and summed these arithmetically to get a quantity that is equivalent to a
seconds/second scanning index. They showed that, as a design exercise, the
predicted scanning workload for a selected aircraft panel layout could be
reduced from 1.32 (anything greater than 1.0 is overload) to 1.01 by
combining certain displays. Although their predicted best display arrangement
"agrees with that actually adopted" by a major airline for FAA Category II
certification, empirical validations of scan times and fixation durations are
not presented. In a subsequent study, Weir and Klein (CtU) collected data by
using a "DC-8" flight simulator; however, their results in terms of scan times
were compared with previous findings with aircraft and simulators rather than
with theoretical predictions based on display information. Further discussion
of this analytic approach can be found in Allen, Clement, and Jex (1).

The analytic approach to workload prediction requires considerable
knowledge about the characteristics of the forcing functions of the various
instruments and displays, but, where such information is available, the
methodology developed to date shows promise, especially in applications to
new, design-stage systems. However, substantial effort in the empirical
validation of the procedures is still needed and warranted.

B. Synthetic Method. What is being referred to here as the synthetic
method might equally well be called a combinatorial method. The point of
departure of this method is a task analysis of the system; the proposed
mission or operating profile is broken down into segments or phases that are
relatively homogeneous with respect to the way the system is expected to
operate. For each such mission phase, the specific performance demands
placed on the operator are identified through task analysis procedures. Once
individual tasks and subtasks have been isolated, previously available
(e.g., Munger, Smith, and Payne, 40) or ad hoc data are compiled on the
performance of the tasks with both performance times and operator
reliabilities being taken into account. The information on performance times
is then accumulated for a given mission phase and the resultant sum is
compared with the predicted duration of the phase. The comparison of these
two quantities--time required to perform versus time available--can be used
to reflect an index of workload. Although other factors can be included in
this synthesizing process, time is typically the primary variable considered.
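
The comparison of time required with time available can be sketched in
Python as a simple ratio per mission phase. The phase names, task times, and
phase durations below are hypothetical, and straight addition of task times
is the simplest possible accumulation rule, chosen here only for
illustration.

```python
def phase_workload(task_times, time_available):
    """Ratio of summed task performance time to time available in a phase."""
    return sum(task_times) / time_available


# Hypothetical mission phases: (task performance times in s, phase duration in s).
phases = {
    "approach": ([20.0, 15.0, 10.0], 60.0),
    "landing": ([25.0, 20.0], 40.0),
}

for name, (task_times, available) in phases.items():
    index = phase_workload(task_times, available)
    status = "overload" if index > 1.0 else "within capacity"
    print(name, round(index, 3), status)
```

An index above 1.0 means the tasks assigned to a phase would take longer
than the phase lasts, so the phase would be a candidate for redesign or
reallocation of duties.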

One example of this approach is the Cockpit Evaluation and Design
Analysis System described by Brown, Stone, and Pearce (8). Brown et al.
define workload as follows: "Flight crew workload is the ratio of the
summation of required crew-equipment performance time to the time available
within the constraints regulated by a given flight or mission." Their design
and analysis system is computerized and is organized in such a way that
detailed information can be included regarding required times, available
times, items of equipment involved, and flight phases as well as the design
personnel responsible for the various equipments and subsystems.

Flight phases are further broken down by identification of what they call
milestones, a milestone being a change in heading, airspeed, altitude, etc.
Preliminary allocations of duties and activities are based on operating
techniques of expert pilots and operating procedures for similar aircraft.
For purposes of workload prediction for a given segment, the computer output
is expressed in the form of percentage-of-capacity figures for each task
element each crew member is to perform. In this way critical periods in a
mission phase can be identified and possible corrective measures evaluated.
The primary purpose of the design analysis system "... is to provide data
for use in comparative evaluation of alternative crew station designs." Its
major value is the ease with which system changes can be evaluated. As
Brown et al. state: "Any workload reduction must be evaluated in terms of
the context within which this occurs and it seems senseless to increase cost
by automating a feature that saves work during low workload periods only."

There are a number of other instances of the application of the
synthetic methodology to the problems of workload prediction. Although the
basic approaches are similar, there are some potentially important
differences in detail. For example, Klein and Cassidy (32) describe an
approach to estimating work requirements in which, apparently, an average
required performance time is used to reflect the contribution of each task
to the total work requirements, but the sum of these times can exceed the
time available and thus lead to the notion of time stress. Their general
procedure for analyzing the mission requirements is basically as described
above. Klein and Cassidy also point out the need to recognize the
nonadditivity of workload elements. This nonadditivity was investigated by
evaluating a tracking task when performed in conjunction with a discrete
task; they concluded: "Workload elements do not interlace in a directly
additive fashion."

Wingert (56) places considerable emphasis on the fact that the
performance of two tasks in combination often represents a workload that is
less than the sum of the individual workloads. He used a model that took
account of the nature of the task input (visual, auditory, or kinesthetic)
and the task output (motor, vocal, or none required). He then prepared an
"interlace table" for different combinations of two tasks with the various
possible combinations of input and output modes. The actual values used in
the table depended on analyses of the scanning requirements, information-
processing-time predictions, and the set of summation rules assumed to apply
to particular pairs of inputs and outputs. A specific set of tasks was
evaluated by using a fixed-base helicopter simulator, and "interlace
coefficients" were determined. The resultant coefficients are used, in the
simple case, as follows:

Total workload = WL(1) + WL(2) - I * WL(2)
where I = the interlace coefficient.

Wingert discusses the concept of interlacing in the context of parallel
versus serial processing of information, and, in general, the amount of
interlacing expected depends on the extent to which parallel processing is
possible.
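
The simple-case interlace formula can be written directly in Python. The
workload values and coefficients below are hypothetical; restricting the
interlace coefficient to the range 0 to 1 is an assumption made for this
sketch and is not stated in the source.

```python
def total_workload(wl1, wl2, interlace):
    """Total workload = WL(1) + WL(2) - I * WL(2).

    interlace (I) is assumed here to lie between 0 (no interlacing, the
    workloads simply add) and 1 (the second task is fully absorbed).
    """
    if not 0.0 <= interlace <= 1.0:
        raise ValueError("interlace coefficient assumed to be in [0, 1]")
    return wl1 + wl2 - interlace * wl2


# Hypothetical workloads for two tasks at three levels of interlacing.
for coefficient in (0.0, 0.4, 1.0):
    print(coefficient, total_workload(0.6, 0.5, coefficient))
```

With I = 0 the combined workload is the plain sum of the two tasks; as I
approaches 1 the second task adds progressively less, which corresponds to
greater opportunity for parallel processing.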

This notion of interlacing can also be viewed from the simpler time-
sharing frame of reference. The highly skilled operator has typically
"automated" many aspects of his complex task in a way such that many of the
elements require little if any information processing (channel capacity) for
satisfactory execution of the required behaviors. Consider a two-dimensional
tracking task as represented by the instrument landing system (ILS) display.
Assume that the pilot, on approaching the outer marker, observes that he is
slightly (but undesirably) below glide slope. Through long experience, he
is able to apply an appropriate adjustment that will bring the aircraft
smoothly to the glide slope. He does not then sit and watch the needle
slowly drop! He turns his attention to other displays (e.g., airspeed) and
knows approximately when to return his attention to the ILS display.
Similarly, once he has the ILS needles centered and has established a proper
rate of descent, only under very adverse conditions of wind and buffeting
will he have to give the ILS display his undivided attention. In other
words, how often he must look at a display to insure satisfactory performance
depends on the "forcing function" acting on that display and the criticality
of the task in terms of permissible error rates and amplitudes (cf. Senders,
49). To consider another kind of behavior, the neophyte automobile driver
must give most of his attention to the steering task of "keeping the car on
the road." For the expert driver, steering is concerned with avoiding rough
spots, maintaining safe separations from oncoming traffic, etc.; keeping the
car on the road has been automated. And if we look far enough we may run
across an oldtime telegraph operator who can send or receive a message while
simultaneously telling us about the good old days.


However, we should keep in mind that, at least at the present state-of-
the-art, caution is in order in assuming too much interlacing. Such skills
may be highly vulnerable to stress and other such factors (13). By way of
analogy, we do not want an aircraft designed to just withstand the maximum
expected g and gust loads.


V. Simulation Methods.


A. Fidelity. Webster (53) defines a simulator as "one that simulates,
specif: a device in a laboratory that enables the operator to reproduce
under test conditions phenomena likely to occur in actual performance." If
we interpret the word "phenomena" to mean "system-operating characteristics,"
then the dictionary definition certainly states the intent of the designer
of the simulator. Chapanis (10) considers a simulation to be a kind of model
and prefers to define models as simply being analogies of some particular
part of the real world that is of interest to the model maker. Chapanis
makes a good case for this usage, and an important value in thinking of a
simulation as being an analogy is that we are all aware that analogies tend
to come apart when they are pushed too hard or are examined too closely.

When we talk about fidelity of simulation, we are thus talking about "how
hard we can push" before the analogy breaks down.




The difficulties encountered in achieving adequate fidelity in a
simulator are primarily a function of the purpose for which the simulator is
to be used. Thus, for some purposes, a control stick and a display with an
appropriate interface provide adequate levels of fidelity. As Hopkins (28)
has said, the kinds of things that are needed on a simulator depend on
"(1) your purpose in using it, and (2) your method of using it. . . . Cost
effectiveness has not been demonstrated for all the bells and whistles that
come as standard trimmings on our current flight training simulators."

B. Assumptions. The basic assumption underlying the use of simulation
in virtually any context is that the device represents to a satisfactory
degree those elements of the system being simulated that are important and
relevant to the purposes of the enterprise being undertaken. More
specifically, in using a simulator to study pilot workload, it is assumed
that:


1. Those factors in the real system that are relevant and important to
the operator functions being evaluated are present.

2. Those aspects of the simulation that differ from the real system will
not introduce important disturbances in the measures being taken.

3. Behavioral effects of task manipulations can be isolated from
simulator operating characteristics as sources of variance.

4. The performance effects of the variables being manipulated in the
simulation do not importantly differ from the effects that would occur in the
real system.

Most of the work that has focused on the evaluation of the usefulness of
simulators has been done in the context of the substitution of simulator
training or experience for actual flight training or experience, and even in
this area many questions regarding training simulators have been at best only
partially answered. (A special issue of Human Factors (1963, No. 6) was
devoted to this problem area.)

Unfortunately, many of the investigations that have looked at workload
and other design questions using simulation have been reported in private
company or laboratory internal publications or not at all. Thus, the open
literature is virtually devoid of well-documented studies in which
simulation--in the ordinary meaning of that term--was used to investigate
workload; i.e., where measures were taken from the simulator to provide
indices of the performance effects of workload variations as produced by
changes in the simulator tasks.

C. A Flight Simulator Example. Corkindale (20) reported a study of
missile control performance as a function of concurrent workload using a
fixed-base flight simulator. The study included the following workload
conditions:

1. Missile control tasks only. (Two-dimensional tracking using a joy
stick with the left hand and a TV display.)

2. Simulator manual control using a Head-Up Display (HUD). (Two-
dimensional tracking with control column.)

3. Missile control plus HUD manual control. (Two independent, two-
dimensional tracking tasks--one with left hand and one with right hand. At
the end of the first [illegible] seconds, the TV came on and the subject
watched for appearance of target.)

4. Missile control task plus HUD monitoring. (Two-dimensional tracking
of missile plus monitoring of HUD for an infrequently presented signal that
subject responded to by pressing a button on the control column.)

Performance of the missile and aircraft control tasks was measured by
recording integrated error in each axis for each tracking task. In addition,
detection time for the TV target was measured. Once the TV target was
acknowledged and the crosshairs had appeared, the missile tracking task
lasted just 10 seconds; the HUD aircraft control task, when present, lasted
for approximately 5 minutes 10 seconds; the missile control task always fell
in the second half of the test trial.

All but one of the measures evaluated were significantly affected by
workload; surprisingly, horizontal error in tracking the TV display target was
not sensitive to these workload variations. A major conclusion drawn by
Corkindale (20) was that his findings fit well with the work that Rolfe (42)
reviewed and interpreted to indicate that secondary tasks typically produce
degradation of the performance of the primary task in spite of instructions to
maintain the 'highest level of performance' on that task. It would be
interesting to know what sort of prediction the analytic method of estimating
workload (e.g., Senders, 49) would make as regards the task combinations used.
Corkindale cites evidence that the subjects spent a significantly smaller
percentage of the time looking at the HUD when the TV was on ([illegible]
percent) than when the TV was off ([illegible] percent), even though the HUD
was the primary source of feedback to the subject as to how well he was
controlling the aircraft. Therefore, one would be tempted to speculate that
the analytic method would predict that a pilot cannot do both of the tasks
without at least some degradation of performance on both. What, then, should
we expect the pilot to do when we ask him to try to do both tasks
simultaneously? Assuming that the pilots used in such a study were mission
oriented, then their approach to the situation might very well be as follows:

"This is an exercise in which I am expected to hit a target with an
air-to-surface guided weapon. I have to control the missile and fly
the airplane. I know that I cannot fly as well while controlling the
missile as I can while I am not. So, I will try my best to hit the
target and will consider the mission a success if I score a hit and
do not crash."


It could be argued that many military pilots would follow this line of
reasoning unless they were told that they must maintain undiminished
control of the aircraft even if they never hit any targets. And with
instructions of that sort, it might be difficult to maintain good levels of
subject motivation to perform the task.

Assuming that Corkindale's subjects were able to handle the aircraft
control task in a manner that satisfied them when that was their only task,
what does a (significant) doubling of the error scores with the addition of
the TV task mean? Did the pilots think they were controlling the aircraft in
an acceptable manner in the two-task condition? Whether they did or not,
what was their criterion? Did any of them ever "crash"? Without some sort
of absolute error criterion, the interpretation of the results in this kind
of study (or any simulator study) is very difficult. We are on somewhat
firmer ground if the purpose of a study is to compare the workload properties
of, for example, two alternative ways of displaying the same information. If
there is a substantial and statistically significant advantage of one
alternative, then cost-versus-effectiveness analyses can be made. But even
in this simple case, the absence of absolute criteria creates problems; for
example, what procedure can be used to establish what a "substantial
advantage" is in relation to "real world" requirements? In other words, we
must not forget that in many important respects a simulation is merely an
analog of some aspect of the real world.

D. A Space Simulator Example. Cotterman and Wood (21) attempted a direct
treatment of the problem of criteria in a simulation context in a study of
the retention of pilot skills associated with a lunar landing mission. This
study involved a full mission simulation at the Martin-Marietta Corporation
as part of the NASA Space program. The subjects in this study were 12
aerospace research pilots who had participated previously in a Human
Reliability Program study conducted with this simulation system. The
specific goal of the study reported by Cotterman and Wood was the evaluation
of the retention of skill after relatively long periods (13 weeks) of disuse.
The total study concerned nine separate mission phases, with from one to four
performance criteria for each phase. For present purposes, only one phase
will be discussed; viz., the "Brake and Hover" phase involved in the lunar
landing.

Based on engineering analyses, permissible error rates had been estab-
lished for four motion parameters during the Brake and Hover phase. These
were: displacement (or range error), [illegible] feet; displacement rate,
10 feet/second; impact rate, 10 feet/second; percentage fuel consumed,
95 percent. Exceeding these values by appreciable amounts would incur
unacceptable risk of mission failure.

The analytical approach applied by Cotterman and Wood was to use the data
on the last four training trials for each pilot to establish a mean and a
standard deviation for each parameter. Since their interest was in
establishing whether subjects could attain performance at a high level of
consistency, they selected a statistical criterion that was associated with a
probability of [illegible] that the subject would perform within the criterion
tolerances. The actual calculations, though somewhat laborious if done by
hand, are conceptually simple. First, the standard deviation for the data
from a given pilot for a given measure is computed; then, a normal deviate
("z" score) is found by dividing the difference between the criterion and the
obtained score of interest by the standard deviation. A table of normal
deviates can then be used to establish an approximation of the probability
that the pilot in question will in fact be expected to stay within the
criterion, or, using the appropriate equations, an exact probability can be
computed. For one subject in the study reported by Cotterman and Wood, it
was found that probabilities of staying within the criterion on the four
previously mentioned variables were: 0.998; [illegible]; 0.9995; and 0.9995.
If the events on which each of these probabilities is based are independent,
then their cumulative product is the probability that the entire mission
phase will be within the criterion limit. With this approach, whether applied
to simulation or to an in-flight situation--assuming that the criteria can be
specified--the probabilities can be developed in a way that makes them
useable for purposes of reliability engineering. The requirements are
(1) the data must be quantitative in form, (2) enough repetitions per subject
must be provided to achieve reasonably reliable estimates of the standard
deviation, and (3) some criterion must be available that is specifiable in
quantitative form.
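
The calculation described above can be sketched in a few lines. This is an
illustration under stated assumptions, not a reconstruction of Cotterman and
Wood's actual procedure: the "obtained score of interest" is taken to be the
mean of the trials, the criterion is treated as a one-sided upper limit, the
sample (n - 1) standard deviation is used, and all trial values are
hypothetical.

```python
from math import prod
from statistics import NormalDist, mean, stdev


def prob_within_criterion(trial_scores, criterion):
    """Estimate the probability of staying at or below an upper criterion.

    Assumptions of this sketch: scores are normally distributed, the
    obtained score of interest is the trial mean, and the criterion is
    a one-sided upper limit.
    """
    m = mean(trial_scores)
    s = stdev(trial_scores)  # sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("standard deviation is zero; z is undefined")
    z = (criterion - m) / s  # normal deviate ("z" score)
    return NormalDist().cdf(z)  # replaces the table of normal deviates


def phase_probability(probabilities):
    """Product of per-parameter probabilities, assuming independence."""
    return prod(probabilities)


# Hypothetical last-four-trial scores for two parameters (not study data).
p_rate = prob_within_criterion([6.0, 7.0, 5.0, 6.0], criterion=10.0)
p_impact = prob_within_criterion([7.5, 8.0, 6.5, 7.0], criterion=10.0)
print(p_rate, p_impact, phase_probability([p_rate, p_impact]))
```

With only four trials per pilot the standard deviation estimate is itself
quite uncertain, which is the point of requirement (2) above.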

VI. In-Flight Methods.

A. System-based Measures. Various techniques have been used to record
indices of performance in aircraft. They have involved varying degrees of
difficulty of installation and have been used with varying degrees of success.
Some of the earliest systems used voltage analogs, either from direct
instrument pickoffs or from repeater instruments, to drive the pens of an
ink-writing oscillograph. More recently, frequency modulation techniques have
been used to record analog signals onto magnetic tape; off-line computer
readout and analysis can then be applied to the tapes. And still more
recently, on-board digitizing techniques have been used to record data on
magnetic tape directly in digital format for later computer analysis.

Some of the earliest work on studies of aircrew workload involved
variations on the standard techniques of time-and-motion study (e.g.,
Christensen, [illegible]), and, at about that same time, pilot workload
(instrument scanning) was studied by use of motion pictures of pilot
eye-movements during instrument approaches ([illegible]). Still more
recently, Weir and Klein (54) describe the use of an Eye-Point of Regard
system that uses a horizontal movement detector (bite board) and corneal
reflection to give a resolution of "about ± 1°" in either axis with respect
to the eye fixation point. Photographic and videotape techniques have also
been used to record general pilot activities in simulators as well as
aircraft and to derive time and frequency measures of control usage. And
still more recently, Geiselhart, Schiffler, and Ivey (24) used time-and-motion
study techniques in evaluating crew requirements for the KC-135 tanker
aircraft on actual missions.

Roscoe and Williges (41) reported a study carried out in a Beechcraft
C45H using each of eight experimental display conditions under simulated
instrument flight conditions. The tasks confronting the subjects, who were
naive to flying, were (1) tracking a randomly generated command flight path;
(2) a disturbed attitude task that required subjects to compensate for
Gaussian noise summed with the actual bank attitude signal; and (3) recovery
from unusual attitudes entered with subliminal angular accelerations. All
data were recorded on a strip recorder and on magnetic tape. Among other
results reported by Roscoe and Williges was the finding that the maintenance
of command heading was significantly better with the displays in a pursuit
mode as compared to a compensatory mode.

Knoop (13) reports a study designed to evaluate the feasibility of
automatically assessing T-37 student pilot performance in the Air Force
Undergraduate Pilot Training program. A T-37B aircraft was instrumented to
record 21 flight and control parameters in digital form on magnetic tape.
Major variables (airspeed, pitch, roll, stick position in two dimensions, and
rudder position) were sampled 100 times per second. Other variables, such as
altitude, heading, flap position, etc., were sampled at a 10-Hz rate. A
major part of this effort involved attempts on the part of instructor pilots
to fly prescribed maneuvers in as nearly perfect a manner as possible. These
maneuvers were broken down into phases and subjected to computer analyses in
an attempt to develop measures that best characterized a high level of
performance; concurrently, subjective ratings of the instructor pilots were
also used as part of the evaluation. The resultant functions of the
various control and performance parameters were compared with those of
student pilots to try to identify those measures that best discriminate
between trainees and skilled pilots. Overall, this effort met with mixed
success, and major attention was diverted to trying to follow the progress of
students through the training program. A major difficulty encountered was
the clear lack of agreement across instructors as to what was most important
in characterizing good performance in particular maneuvers.

Hasbrook, Rasmussen, and Willis (27) reported an in-flight evaluation of
a "peripheral vision flight display" (PVFD) in a Beechcraft Bonanza 35A
aircraft. Each of [illegible] pilots flew two ILS approaches with a
conventional display system; they also flew five approaches with the PVFD
system, but only the last two of these approaches were considered for data
analysis purposes. Performance levels were recorded on a 14-channel FM analog
tape system installed in the left rear seat of the aircraft. Twelve channels
of information were recorded: pilot heart rate; aircraft pitch and roll (taken
from the primary attitude indicator); vertical and lateral deviations from
the ILS centerline (taken from the glide slope and localizer signals);
altitude, airspeed, and vertical speed (obtained from the aircraft's pressure
and static air systems); vertical acceleration (taken from an accelerometer
located near the center of gravity of the aircraft); heading deviations
(taken from a remote gyro-stabilized compass); and control wheel data
(derived from mechanoelectric transducers connected to the aircraft's control
cables). Event signals were inserted on a separate data channel by the use
of a manual switch. Data were recorded starting at the beginning of the
approach at the outer marker and ending when the runway threshold was crossed
at an altitude of 100 feet; at that point the subject was instructed to
increase power and go around. No differences were found between the displays,
but the more experienced pilots of the group (an average of 1,267 hours of
instrument time) maintained a small, significant superiority on holding to
the glide slope between the outer and middle markers as compared to a less
experienced group (an average of 104 hours of instrument time). Thus,
although Hasbrook et al. stated that the pilots generally rated the PVFD as
good to excellent, the PVFD display configuration did not result in
statistically superior performance.

Billings, Gerke, and Wick (6) did a study that, though it did not involve
manipulation of workload, is of interest because it involved both in-flight
and simulator performance. The variable of interest was the dosage of sodium
secobarbital (0, 100, or 200 mg). The in-flight portion of the study was
carried out by using a specially instrumented Cessna 172; the simulator part
of the study used a GAT-1 simulator. For both the aircraft and the simulator,
data were recorded in digital format at a sampling rate of 25 Hz to yield
measures of average absolute error in holding to the localizer, glide path,
and commanded airspeed (100 mph); root-mean-square (RMS) error was derived by
appropriate computational procedures for each of the variables. The five
"highly experienced professional pilots" who served in the study showed a
small, nonsignificant overall increase in error across the six aircraft
flights (averaged over drug conditions) and a slightly larger, significant
decrease in error over the six simulator flights (again averaged over drug
effects). It is interesting to note that whereas all of the six statistical
tests carried out on the simulator data showed a significant drug effect,
only four of the six tests on the aircraft data showed the drug effect to be
significant. In addition, for all segments of the approach the no-drug
(placebo) condition was best in the simulator, and for all but one segment
the 100-mg dose resulted in better performance than did the 200-mg dose in
the simulator. The analogous results were mixed in the case of the aircraft
data. On all three measures (glide slope, localizer, and airspeed) the RMS
variability was less in the simulator than in the aircraft; and for only one
absolute measure (deviation from command airspeed at the 200-mg dose) was
performance in the simulator numerically poorer than in the aircraft. Direct
statistical comparisons between simulator and aircraft were not reported;
perhaps they were not feasible.
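
The two error measures named above are simple to compute from a sampled
record. The sketch below is only illustrative: the function names and the
airspeed samples are hypothetical, and both measures are taken about the
commanded value, which is an assumption (the report does not give the
computational details used by Billings et al.).

```python
from math import sqrt


def average_absolute_error(samples, commanded):
    """Mean of |sample - commanded| over the recorded samples."""
    return sum(abs(x - commanded) for x in samples) / len(samples)


def rms_error(samples, commanded):
    """Root-mean-square deviation of the samples from the commanded value."""
    return sqrt(sum((x - commanded) ** 2 for x in samples) / len(samples))


# Hypothetical airspeed samples (mph) against a commanded 100 mph.
airspeed = [98.0, 102.0, 100.0, 104.0]
print(average_absolute_error(airspeed, 100.0))
print(rms_error(airspeed, 100.0))
```

RMS error weights large excursions more heavily than average absolute
error, so the two measures can rank the same set of approaches differently.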

B. Externally Based Measures. Brictson, Ciavarelli, and Wulfeck (7)
describe a system that has been used to assess the quality of aircraft
carrier approaches and landings. The workload variations were those
associated with night versus day landings. The procedure for recording the
final approach performance involved a shipboard instrumentation system
consisting of twin precision radars and a signal data recorder that provided
up to eight channels of continuous flight information. The range error was
reported to be on the order of 4 feet and the angular error, on the order of
0.3 milliradian. Range, true altitude, altitude error, lateral error, sink
speed, true air speed, deck pitch, and closing speed were the variables
usually recorded. Among other findings, Brictson et al. reported that
altitude errors were greater at night than during the day with a greater
tendency for the approach to be below glide slope at night. Brictson et al.
also report that a reasonably good measure of the quality of the approach
and landing was obtained by simply noting which of the four arresting wires
was hooked and the number of "bolters" (no arresting wire engaged). The
major difference in the tasks of night versus day landing was in the
impoverishment of the visual field in terms of details of the carrier and the
texture of the water. Not having those cues made the task more difficult,
and Brictson et al. were able to develop differential criteria for predicting
successful landings at night versus during the day for various departures
from the optimum approach configuration.

VII. Discussion, Recommendations, Cautions, and Conclusions.

A. A Hypothetical Research Vehicle. Let us assume that there exists a
real aircraft system with the following capabilities: (1) An exact
assignment of the nature and number of pilot duties or activities can be
made for any given mission phase. (2) It is possible to vary those duties
singly or in combination over time. (3) Control and display characteristics
can be manipulated at will. (4) Precise and reliable quantitative indices of
the task demands placed on the pilot by the system are available for all task
elements. (5) Precise and reliable quantitative measures of the skill with
which the pilot meets those demands are available. (6) An adequate criterion
measure of system performance is available.

What kinds of information might we expect to be able to develop as
regards pilot workload through use of such a system? First, as we add tasks
in different combinations, we should be able to determine the priorities
the pilot assigns to the different tasks and whether these priority
assignments are consistent across pilots; as the number of actions required
per unit of time approaches and exceeds the time available, or as
simultaneous demands for action arise, some tasks will be given less
attention with a resultant lowering of performance on those tasks. Second,
we should be able to determine how the different elements of the pilot's job
interact; as different tasks are added to the total workload, do some tasks
tend to interfere with the performance of other tasks? And, third, we should
be able to determine what kinds of tasks or performance functions are most
sensitive to variations in total demand.


In a similar manner, for a given task load on our assumed system, we
should be able to determine the relative sensitivity of the different
performance demands to various environmental and procedural factors. We
should, in this somewhat different context, again see which tasks are given
priority. And we should be able to acquire information on the relative
importance of "operator style" in system performance.

From systematic studies of task characteristics, task combinations, and
procedural factors, we should be able to develop a quantitative concept of
workload capacity or--as some prefer to call it--channel capacity. Thus, we
should be able to arrive at a notion of workload for a given mission phase as
involving some portion of the pilot's total moment-to-moment capacity to
satisfy the system demands.

Unfortunately, there appear to be no instances in which a system or a
simulated system has been subjected to these sorts of manipulations in any
kind of programmatic attack on the nature of pilot workload. (Although
something like this has been done with synthetic work tasks, the programs
have not been as complete or as systematic as would be desirable, and the
results are, therefore, of more relevance to environmental and procedural
variables than to workload per se (cf. Chiles et al., [illegible];
Alluisi, 2).)

However, we can, perhaps, make some empirically based projections
(educated guesses) as to what some of the products of such a program might
be. First, we would surely find that some tasks will be given priority.
Which ones will depend on training and the perceived criticality of the task
to the safety of the system and to the probability of mission accomplishment.
For example, ILS-type guidance information will be given very high priority
during very low visibility approach conditions; and there is reason to
believe that some of the instruments are, on occasion, given too low a
priority after breakout with potentially disastrous results.

Another predictable result is that the elements of many combinations of
tasks will be found to be nonadditive (in the simplest meaning of that term).
At high levels of pilot skill at time sharing, a number of tasks can apparently
be performed without evidence of decrements or cross interference. However,
where tasks present conflicting demands, the lack of additivity may take on a
much different character; the specific effects will largely depend on the
required sampling rate for the different information sources coupled with the
required "dwell times"; i.e., how long it takes the pilot to extract the
necessary information. Perhaps the most important single factor in this area
is the degree of freedom the pilot can exercise as to exactly when various
actions must be initiated.
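
As a rough illustration of how sampling rate and dwell time combine, the
sketch below adds up the fraction of each second that would be occupied if
every information source had to be sampled at a fixed rate for a fixed dwell
time. The source names, rates, and dwell times are hypothetical, and the
simple additive occupancy model is an assumption made for illustration only;
it is not a method proposed in this report.

```python
def attention_occupancy(sources):
    """Return the fraction of time occupied by sampling all sources.

    sources maps a (hypothetical) source name to a tuple of
    (samples per second, dwell time in seconds per sample).
    """
    return sum(rate * dwell for rate, dwell in sources.values())


def is_time_shareable(sources, limit=1.0):
    """True if the summed demand does not exceed the available time."""
    return attention_occupancy(sources) <= limit


# Hypothetical values chosen only for illustration.
approach_scan = {
    "glide_slope": (0.5, 0.4),  # one look every 2 s, 0.4 s per look
    "localizer": (0.5, 0.4),    # one look every 2 s, 0.4 s per look
    "airspeed": (0.2, 0.5),     # one look every 5 s, 0.5 s per look
}

if __name__ == "__main__":
    print(attention_occupancy(approach_scan))
    print(is_time_shareable(approach_scan))
```

A model of this kind says nothing about the pilot's freedom to choose when
each look is made, which the text identifies as perhaps the most important
single factor.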

If the suggested program were to be carried far enough, it would probably
develop that only a limited number of operator styles will emerge that will
allow or insure overall satisfaction of the system demands.


And, finally, it will be only after substantial and thorough research
that the quantitative methods will yield readily useable indices that relate
directly to "how hard the pilot has to work" with a given system workload
configuration.

The fact that these above-mentioned "educated guesses" are, for the most
part, rather obvious should not be allowed to detract from the clear
desirability of attempting their empirical verification. Perhaps on such a
"bare bones" kind of outline a general theory of workload could be developed.

B. Choosing a Method. The first and foremost factor to keep in mind in
choosing a methodology in attacking some particular workload question is the
purpose or goal of the research. This is true whether we are choosing from
among the kinds of methods discussed here or from among those discussed in
some other study.

The primary thing to keep in mind is that the measures being taken should
allow the detection of operationally important changes in the pilot's ability
to satisfy system demands as a function of the workload variables being
manipulated. If a given measure or pattern of measures were to reveal
decrements for one configuration of system demands in relation to another
configuration, the decrements should be meaningfully relatable to critical
operational tasks in terms of pilot reliability, system safety, and/or
probability of mission success. Alternatively (and this is much more
difficult to establish), if no decrements are found for a given workload
configuration, it should be clearly possible to predict that the pilot could
satisfy the system demands under operational conditions. At the same time,
every possible effort (within reason and the scope of available resources)
should be made to design the research so that maximum generality across
systems is possible. Clearly, when we choose a method and select the
variables that are to be measured (the dependent variables), we are committing
ourselves to a particular realm of discourse as regards system workload
parameters. Thus, we must be certain that the basic problem that gave rise
to the research can in fact be handled within that realm of discourse. (The
importance of the selection of dependent variables has been dealt with in
some detail by Chapanis, 11; by Alluisi, ?; and by Chiles, ?.)

The most pressing and the most difficult problem in assessing workload
effects (whatever method is chosen) lies in the development of reliable,
quantitative criteria that validly reflect system performance. We need
criteria against which to evaluate the results of our research. We must be
able to distinguish acceptable from unacceptable, good from acceptable, and
excellent from good performance of the system. We must be able to make these
distinctions quantitatively and reliably. And we must be able to disentangle
pilot performance, machine performance, and pilot-machine performance.
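
Disentangling pilot, machine, and pilot-machine contributions is, in
statistical terms, a problem of partitioning variance. As a minimal sketch,
assuming a balanced table with one hypothetical score for each pilot on each
machine configuration, the textbook two-way partition of sums of squares can
be written as follows. The function name and the data are illustrative
assumptions, not an analysis taken from this report.

```python
def partition_variance(scores):
    """Partition the total sum of squares of a balanced table.

    scores[i][j] is the score of pilot i on machine configuration j
    (one observation per cell). With one observation per cell, the
    residual mixes pilot-machine interaction with error.
    """
    n_pilots = len(scores)
    n_machines = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_pilots * n_machines)
    pilot_means = [sum(row) / n_machines for row in scores]
    machine_means = [
        sum(scores[i][j] for i in range(n_pilots)) / n_pilots
        for j in range(n_machines)
    ]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_pilot = n_machines * sum((m - grand) ** 2 for m in pilot_means)
    ss_machine = n_pilots * sum((m - grand) ** 2 for m in machine_means)
    ss_residual = ss_total - ss_pilot - ss_machine
    return {
        "pilot": ss_pilot,
        "machine": ss_machine,
        "residual": ss_residual,
        "total": ss_total,
    }


# Hypothetical tracking-error scores: 3 pilots x 2 configurations.
example_scores = [[4.0, 6.0], [5.0, 7.0], [6.0, 8.0]]

if __name__ == "__main__":
    print(partition_variance(example_scores))
```

With only one observation per cell, the pilot-machine interaction cannot be
separated from error; replicated trials in each cell would be needed for that.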

Ultimately, we want a method with which it would be possible to assign
reliable variance, as appropriate, to the man, to the machine, and/or to the
man-machine interface.

For some specific questions this may appear to be a deceptively
approachable question. For example, if we need to determine which of two
instrument landing systems makes the smaller contribution to pilot workload,
we could simply secure accurate measures of the deviation of the aircraft from
the glide slope and the localizer and perhaps monitor airspeed. Comparison
of the values of these measures for the two displays should give us an index
of their workload-inducing properties. However, it is entirely conceivable
that one display would lead to smaller errors only because the pilot could,
by working harder, take advantage of some peculiarity of that display in
holding to the proper course; at the same time, the pilot might very well be
less able to respond appropriately to some emergency condition that might
arise from some other quarter. Thus, in this specific example, we would need
to add a variable that would shed light on how much of the pilot's workload
capacity was being used up by each display. In our hypothetical, completely
flexible aircraft system, we could introduce some sort of malfunction that,
conceivably, could be handled readily with the otherwise poorer display but
only with considerable difficulty in the case of the "better" display. This
is admittedly a highly artificial example and the intent is merely to suggest
a possible way in which what might appear to be a simple measurement problem
might not be so easy after all. The other intent in introducing the example
is to suggest that when we draw a conclusion based on a particular set of
measures, the results may imply extrapolations well beyond the circumstances
under which the measurements were made. (Remember, analogies, as well as
examples, should not be pushed too far.)

The measurement and analysis approach described by Cotterman and Wood (?1)
in their evaluation of performance in a space vehicle simulator appears to
show considerable promise as a technique for converting "raw" performance
measurements to probabilities of meeting criterion requirements. However,
there is a gap between their application and the typical pilot workload
measurement situation. Specifically, in the case of the Lunar Excursion
Module, the maximum values of various parameters can be specified quite
readily; for example, engineering specifications dictate that the impact
velocity of the vehicle on landing cannot exceed some value without risk of
damage. Such precision is less clearly identifiable in the majority of
aircraft operating situations; typically, rather broad latitude is possible
in the flight parameters without risk of entering unsafe conditions of flight.
Thus, in some areas the application of the procedure to some aircraft mission
phases might become a bit arbitrary. Perhaps for research purposes it would
be necessary and profitable to set up much more stringent criteria than
normal, but not too stringent; the difficulty of the criteria should be such
that the typical pilot from the population of pilots to which we wish to
generalize would, under normal conditions, be capable of performing
satisfactorily.
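
The general idea of converting raw measurements into a probability of meeting
a criterion can be illustrated with a simple empirical proportion. The sketch
below is not the procedure of Cotterman and Wood; it assumes only a list of
per-trial measurements and a hypothetical criterion limit, and it estimates
the probability as the fraction of trials that fall within that limit.

```python
def probability_within_criterion(measurements, limit):
    """Estimate P(|measurement| <= limit) as an empirical proportion."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    hits = sum(1 for x in measurements if abs(x) <= limit)
    return hits / len(measurements)


# Hypothetical glide-slope deviations (arbitrary units) on 8 approaches.
deviations = [0.2, -0.5, 1.1, 0.3, -0.9, 0.4, 1.6, -0.1]

# Hypothetical "normal" and "more stringent" criterion limits.
lenient = probability_within_criterion(deviations, limit=1.5)
stringent = probability_within_criterion(deviations, limit=0.5)

if __name__ == "__main__":
    print(lenient, stringent)
```

Tightening the limit lowers the estimate, which is the sense in which a more
stringent research criterion makes the measure more sensitive, provided the
typical pilot can still meet it under normal conditions.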

Assuming that we have adequate criteria of system performance that reflect
both man and man-machine contributions to system output, how do we proceed?


The first step is the identification of all of those human and machine
factors that could conceivably influence the variable of interest. This list
typically will be unmanageable from a research point of view, and expert
judgment, based on knowledge of human behavior and system behavior, will have
to be applied to eliminate those factors of negligible or relatively small
potential impact. Having developed a (presumably manageable) list of
important factors, we attempt to phrase (or rephrase) the question such that
it becomes amenable to some (as yet unspecified) research technique. We next
arrange the relevant factors into two categories; one category contains items
that are in the nature of constraints or boundary conditions, and the second
category contains items that are in the nature of possible independent
variables; this second category will, of course, include the factor or
factors that gave rise to the need for the research in the first place. Now
we are ready to examine the situation in detail in order to make a decision
as to what would be the best research methodology to apply to the problem.
At this point the available guidelines become very ambiguous and professional
judgment must play a dominant role.

First, we look at what are referred to above as the boundary conditions;
these are the fixed aspects of the operational system from which the problem
derives; they concern factors such as the gross weight of the vehicle, its
flight range, mission characteristics, number of engines, etc. Each of these
factors is evaluated in relation to the question: "Might this factor be
reasonably expected to have an effect on the performance in question?" Then
we examine each item on the list of possible independent variables; and
again we ask the question: "Might this factor be reasonably expected to have
an effect on the performance in question?" Depending on the pattern of
"yeses" and "noes," we will tend to direct our attention toward one
methodology or another.

If, for example, the basic problem is concerned with a perceptual
question, say a visual discrimination in reading two different types of dial,
and kinesthetic or gravitational cues would not be expected to play a role,
then perhaps a more or less traditional laboratory study might be appropriate.
(We will refer to this study as task A.) However, if the instrument reading
must be made while performing some other task, say a two-dimensional tracking
task (we will call this study task B), then perhaps a part-task simulator
would be in order. If the performance of task B may be importantly influenced
by the insertion of command information, then a more elaborate simulation
might be in order (task C). And if kinesthetic cues may be important, we may
need to go to a motion-type simulator or perhaps an in-flight evaluation.
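
The progression from task A to an in-flight evaluation can be caricatured as
a small decision rule. The sketch below is only a restatement of the examples
in the preceding paragraph; the function name, its yes/no parameters, and the
order in which the questions are asked are assumptions, and in practice
professional judgment rather than a fixed rule would govern the choice.

```python
def suggest_method(concurrent_task, command_information, kinesthetic_cues):
    """Map yes/no answers about the problem to a candidate method.

    The ordering is an assumption drawn from the examples in the text:
    motion cues are checked first, then command information combined
    with a concurrent task, then a concurrent task alone.
    """
    if kinesthetic_cues:
        return "motion-type simulator or in-flight evaluation"
    if concurrent_task and command_information:
        return "more elaborate simulation (task C)"
    if concurrent_task:
        return "part-task simulator (task B)"
    return "traditional laboratory study (task A)"


if __name__ == "__main__":
    # Dial reading alone, with no tracking task and no motion cues.
    print(suggest_method(False, False, False))
    # Dial reading while tracking, with command information inserted.
    print(suggest_method(True, True, False))
```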

Finally, we must select the dependent variable--the thing we are going to
measure. This may be a time measure: how fast can the pilot do a task? It
may be an absolute error measure: how often did he hit the wrong switch? It
may be a relative error measure: what was his average deviation from glide
slope? Whatever the measure, we should if at all possible try to relate the
findings back to system-relevant criteria developed in a manner analogous to
that described by Cotterman and Wood (?1). All too often, the thing that is
chosen for measurement is that which is easiest to acquire or has been used
most often in the past without any specific rationale having been shown that
relates the measure to real-system performance questions.

In some cases the results of the study (accuracy of dial reading in the
above-described example) may provide information that is more or less directly
interpretable in terms of workload. But what if there is no change in any of
the measures as a function of which dial is used? Can we infer that the two
dials represent equal workload contributions? The answer is, of course, no.
Only after we have pushed the total workload to a maximum reasonable and
likely level and found no differences on any measures should we be willing to
assume the equality of the two displays. (It is a peculiarity of statistical
methodology that we cannot prove they are equal.) The procedure we use to
push the apparent level of workload to a maximum is, again, a matter of
professional judgment. But it is an extremely important judgment. If
workload is added in an obviously artificial manner, especially if our
subjects are operational personnel, we may lose them--motivationally speaking.
We must always be sure that the research situation--be it laboratory,
simulator, or aircraft--is presented in a manner such that it will be
responded to as a "real" situation as opposed to a game or a contrived--and
thus (perhaps) meaningless--exercise.

Let no one make the mistake of assuming that this process of choosing a
method is easily executed. The problems are many and the decisions difficult.

C. Conclusions. The general approaches that we have labeled "laboratory
methods" are probably best suited to conducting background research on more
general questions pertaining to workload. Wherever they are appropriate
they are the method of choice because of the typically high degree of control
possible and the attendant high levels of reliability. The synthetic work
method is especially well suited to examining general workload questions
because, by its nature, tasks can be added, removed, and modified with
relative ease, and, depending on the overall level of complexity, large
investments in training time are not required. The fact that it does not
simulate an aircraft is both a strength and a weakness; it is a weakness
because of problems of generalizing to specific systems; it is a strength
because, if the tasks are well chosen, operational subjects can fairly easily
be convinced to react to the synthetic work device for what it is and not make
unfavorable comparisons between its behavior and the behavior of an aircraft.
The secondary loading task method, especially when applied in a simulation or
in-flight context, must be used with care. First, the task that is used to
produce the load increments must be somehow (at least rationally) relatable
to the kinds of activities it is presumed to assess in relation to the real
system. Second, the properties of this task itself must be examined; at a
minimum its reliability and relation to other tasks should be known. Although
some authors (e.g., Rolfe, 4?, and Corkindale, ?0) argue that the primary
task should remain unaffected by the introduction of the loading task, this
condition appears to be unnecessarily restrictive. If the loading task is
properly selected (as noted above) and contradictory results are obtained
(primary task A shows a decrement, primary task B is unchanged, but the
loading task shows a decrement with B and not with A), the findings may be of
little relevance to workload (or channel) capacity as a unitary concept;
however, if such results were not simply the product of some uncontrolled
condition, the finding would certainly be of theoretical if not practical
interest. Perhaps it is better at this stage of development to consider the
concepts channel capacity and single channeledness as being merely manners of
speaking and serving primarily as heuristic devices. Although this does not
argue against the ultimate possibility that the operator is single
channeled, present evidence suggests that the information-handling capacity
of the human operator is influenced by too great a variety of factors to try
to permanently settle the single-channel hypothesis at this time. Returning
to and slightly changing the above example, if task A shows more decrement
than task B with the addition of the loading task and the loading task is
performed better with task B than with task A, we certainly have learned
something about the workload properties of the tasks. The findings, of
course, remain ambiguous as regards channel capacity.

The analytic and the synthetic methods both appear to yield reasonable
results, but both techniques rest on relatively fragile data bases. With
further research on what I would call time sharing behavior, or what Wingert
(bC) calls function interlacing, the synthetic method promises to be a very
useful aid in the design of systems and the allocation of workload. There
is, however, considerable risk that the detailed task information required to
apply the method will be collected and stored in a manner that will tend to
limit its distribution and result in substantial amounts of unnecessary
duplication of effort. Previous attempts to develop clearing houses for the
information have not met with noteworthy success.

Simulators, especially those controlled by general purpose digital
computers, have the potential of generating large amounts of very useful
information on workload. However, whether the programs that resulted in their
acquisition will allow adequate access to such systems for research purposes
remains to be seen. But even given adequate access, research with simulators
is not without its problems. First, naive subjects cannot be expected to
learn to fly in a matter of a few hours; therefore, for most purposes--or at
least for those purposes in which the full capability of the simulator is
used--trained pilots are required who have adequate experience with that
simulator and/or the aircraft it simulates. Thus, salaries can become a
significant part of any substantial research effort. Second, the simulator
is, first and foremost, designed and built to appear to behave like the
aircraft it simulates; the quality of the signals internal to the simulator
need not be very high to satisfy that requirement. Thus, especially with the
older simulators, the available signals often introduce an unacceptably high
degree of unreliability in the final measures. Third, because the simulator
is designed to mimic the airplane, many of the functions are interconnected
in such a way that it can be very difficult to separate them out. For
example, the relative contributions of the simulator, present performance of
interest, concurrent performance that is not of direct interest, and the
interactions of these factors as sources of variance may be hopelessly
entangled. And, fourth, also because the simulator is designed to mimic a
particular airplane, generalization to other aircraft with significantly
different characteristics (such as panel layout and operating procedures)
becomes rather difficult.

Except for some of the safety limitations, in-flight methods can be used
on virtually any problem suitable for investigation in a simulator. However,
the recording of data of demonstrated reliability is a significant problem.
Generally speaking, aircraft are electrically very noisy, and, where
magnetic tape recordings are made (either digitally or through frequency
modulation techniques), substantial programming for signal "reconditioning" is
typically required; glitches are a constant source of annoyance (Knoop, 33).
Unfortunately, no reports of reliability data have been discovered for
in-flight recorded performance measures or for simulator performance measures.
In fact, this is a major technical deficiency in virtually all the reported
research using these two methods. (This criticism applies equally well to
much of the other reported research related to the measurement of workload;
viz., laboratory research.)
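
Reliability data of the kind found missing here are commonly reported as a
test-retest correlation. As a minimal sketch, assuming the same performance
measure has been recorded for the same pilots in two sessions, the Pearson
correlation between the paired scores can be computed as below. The function
name and the scores are hypothetical, and other reliability indices could
equally be used.

```python
import math


def retest_reliability(session_1, session_2):
    """Pearson correlation between paired scores from two sessions."""
    if len(session_1) != len(session_2) or len(session_1) < 2:
        raise ValueError("need two equally long lists of at least 2 scores")
    n = len(session_1)
    mean_1 = sum(session_1) / n
    mean_2 = sum(session_2) / n
    cov = sum((a - mean_1) * (b - mean_2)
              for a, b in zip(session_1, session_2))
    var_1 = sum((a - mean_1) ** 2 for a in session_1)
    var_2 = sum((b - mean_2) ** 2 for b in session_2)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary within each session")
    return cov / math.sqrt(var_1 * var_2)


# Hypothetical RMS glide-slope errors for five pilots on two days.
day_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
day_2 = [1.5, 2.5, 3.0, 4.5, 5.5]

if __name__ == "__main__":
    print(round(retest_reliability(day_1, day_2), 3))
```

A coefficient computed this way describes the measure as recorded, so noise
introduced by the recording system lowers it along with any real instability
in pilot performance.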

Some readers may be disappointed that firmer guidelines have not been
offered as to how to design and conduct research on workload problems in
aviation operations. Those who are familiar with the behavioral literature
on the measurement of complex human performance will understand the absence
of precise, "cookbook" rules for proceeding.